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Abstract 


A family of five-stage fourth-order Runge-Kutta schemes is derived; these schemes require only two 
storage locations. A particular scheme is identified that has desirable efficiency characteristics for 
hyperbolic and parabolic initial (boundary) value problems. This scheme is competitive with the 
classical fourth-order method (high-storage) and is considerably more efficient and accurate than 
existing third-order low-storage schemes. 

Section 1: Introduction 

Many problems of mathematical physics result in the numerical approximation of systems of 
coupled ordinary differential equations (ODE’s). The technique for integrating the equations depends 
on, among other things, the desired accuracy, efficiency, robustness, and simplicity. For the ODE’s 
that result from the direct simulation of the equations of fluid dynamics, the overriding consideration 
is often the availability of fast memory. A numerical integration technique that minimizes memory 
storage is essential and can be formulated with a Runge-Kutta (RK) methodology. 

Williamson [1] showed that all second-order and some third-order RK schemes can be cast in a 
2N-storage format, where N is the dimension of the system of ODE’s. He also showed that (except 
for certain exceptional cases that involve specific forms of the function to be integrated) the four- 
stage fourth-order RK schemes cannot be put into 2N-storage format. Fyfe [2] showed that all 
four-stage fourth-order RK schemes can be implemented in 3N-storage locations. In spite of the 
reduced accuracy, for many applications the methods of choice are the 2N-storage third-order RK 
schemes. 

Although the principal motivation of this work is to derive schemes of 2N storage, both the 
accuracy and efficiency also determine the practical utility of any scheme. For example, a fourth- 
order 2N-storage scheme with a vanishingly small stability envelope is of little practical value. The 
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efficiency of an RK scheme is determined by the absolute size of the stability domain relative to the 
number of stages needed to implement the scheme. Typically, RI< schemes that utilize the minimum 
number of stages necessary to achieve a given order of accuracy are used because they minimize the 
absolute workload for a given order of accuracy. Verner [3], however, demonstrated that the 13-stage 
eighth-order scheme is more efficient than the 12-stage eighth-order scheme. Similarly, Sharp et al. 
[4] showed that the efficiency of many existing fifth-, sixth-, and seventh-order RI< schemes can be 
increased by including additional stages while retaining the same order of accuracy. The additional 
stages are used to optimize the absolute stability domain and increase the relative efficiency of each 
stage. 

A five-stage fourth-order RK scheme that satisfies the 2N-storage constraint is sought. The scheme 
should have the largest absolute linear stability envelope and the lowest truncation error possible. 
In section 2, we describe the conventional and the 2N-storage RI< nomenclature. In section 3, we 
derive the complete set of three-stage third-order (3,3) RI< schemes that can be cast in 2N-storage 
format. We then introduce a new set of four-stage third-order (4,3) RK schemes that are 2N-storage 
schemes. In section 4, we derive the new five-stage fourth-order (5,4) RK schemes that require 2N 
storage. In section 5, we compare all schemes for efficiency, accuracy, and simplicity and then draw 
conclusions in regard to the utility of the new schemes. 


Section 2: Runge-Kutta Nomenclature 

We are concerned with the numerical solution of the initial value problem 

“jjT = F[t,U(t ) ]; U(t 0 ) = U 0 

Assume that the discrete approximation is made with an M - stage explicit RK scheme. The imple- 
mentation over a time step h is accomplished by 

h = 
k = 

u 1 = 

where U° = U(t 0 ) and U 1 
formula. 

Butcher [5] developed a succinct nomenclature for writing all RK schemes in terms of the fixed 


F(t 0 ,U°) 

F ( £o + Cih , U° -f h ^2 dijkj 
\ ;=i 

M 

u° + h £ h,k } 

j-l 

— U(t 0 + h) and the fixed scalars a,j, bj , c t - are the coefficients of the RK 
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scalar coefficients a,j, 6j, c,-. The general, M-stage, explicit RK scheme can be expressed 


as 


Cl 



• 

a 2 ,l 


cm 

1 • 



W . 

6 a/ 


i 

The entries in the first column (c,-, i = 1 ,M) are the intermediate time levels. The coefficients a itj 
nre the intermediate weights of each RK stage i. By definition, c, = i a iJ and Cl = 0 for the 
explicit self-starting case. The last row (bj, j = 1 , M) are the weights of the final stage. 

To devise low-storage RK schemes, Williamson and Fyfe exploit the technique of leaving useful 
information in the storage register. Each successive stage is written onto the same register without 
zeroing the previous value. Thus, the M - stage algorithm becomes 

dUi = AjdUj-! + hF(Uj) 

Uj — Uj-i + BjdUj j — l, M 

So that the algorithm is self-starting, A\ — 0. Only the dU and U vectors must be stored, which 
results in a 2N-storage algorithm. 

Williamson [1] first presented the relationship between the general RK scheme and the 2N-storage 
scheme. Specifically, the following equations provide the relationship between the 2N-storage vari- 
ables Aj and Bj and the conventional RI< variables and c,: 


(1) 


Bj = 


(j ^ M) 

Bm = 

6 M 


A, = 

(bj- 1 ~ Bj~i)/bj 

(i ^ 1>6 j ^ 0) 

Aj = 

1 

H 

1 

H 

+ 

<3 

O' 1, 6 j = 0) 


The precise values of Aj and Bj that are required to yield a higher order scheme remain to be 
determined. Third-order RK schemes are the lowest order schemes for which the determination of 
2N-storage is nontrivial. We begin by demonstrating the procedure for finding high-order 2N storage 
RK schemes for the third-order case. 

Section 3: Third-Order Runge-Kutta Methods 


For a third-order Runge-Kutta scheme, at least three stages are required. The general Butcher 
array diagram for the three-stage explicit scheme is written as 


Cl 



C 2 

«2,1 


c 3 

« 3,1 

a 3,2 


6i 

62 63 
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For third-order accuracy, seven constraints between the coefficients a{ j, 6j, and c, must be satisfied 
[5]: 

3 i 3 


£6, = 1 (1) 


t=l 


^ l( 2 ) 3 I (3) 


»=i 


»=i 


f . 1< 3 > 

£ b i a i,j c j = 7 
i,j = 1 ® 


and 


= a iJ * = 1,3 
i-i 


( 2 ) 


( 3 ) 


The superscripts on each of equations (2) indicate the order of accuracy that the constraint equation 
governs. If equations (2) and (3) are solved for the seven nonlinear algebraic equations in nine 
unknowns, then three solutions result, the most general of which involves a two-parameter family. If 
c 2 and c 3 are defined as the free parameters (ci = 0 for explicit RK schemes), then the following is 
true for case 1 (for which c 2 7^ 0,2/3, or c 3 when c 3 7^ 0): 


bo = 


c 3 - i 


6, = r-4 


- C 2 


bi = 1 — 6 2 — 6 ; 


1 


2c 2 (c 3 - Cj) ’ “ 3 - 2 c 3 (c 3 - c) ; 01 - 1 ■ 02 " 03 : ° 3 ' 2 = (4) 

Two additional solutions account for the solution points excluded from case 1. Exceptional solutions 
exist when c 2 = 2/3 or c 2 = c 3 (c 2 = 0 is an uninteresting case). These solutions are governed by 


62 = - 


, _ 1 » 1 

61 " 4 " 63 5 “ 3 ’ 2 = 463 


for case 2 (for which c 2 — |, c 3 = 0 and & 3 is an arbitrary nonzero number), and 


t I.? 3 1 

62 4 ’ 61 = 4" 63 1 ° 3 ’ 2 = 


( 5 ) 


( 6 ) 


for case 3 (for which c 2 — c 3 = | and b 3 is an arbitrary non-zero number). Equations (4) through 
(6) span the complete set of (3,3) RK schemes. 

The linear stability envelope for the (3,3) RK schemes is determined by 


G = l+ (j>)z+ 6.0. ) Z 2 + ( £ 6,a, J c J j Z : 


1 + Z + -Z 2 + -Z 3 
2 6 


where Z is a complex number that is considered in detail later and G is the amplification matrix. 
The coefficients of G are precisely the linear constraint equations defined in equation (2); thus, all 
(3,3) RK schemes have identical linear stability envelopes. The nonlinear accuracy, however, will 
depend on the specific choice of scheme and will be problem dependent. Numerous different methods 
for choosing a scheme based on accuracy criteria are summarized elsewhere [6]. 

Not all solutions defined in equations (4) through (6) can be put into the 2N-storage format. The 
2N-storage condition imposes independent constraints in addition to those in equations (2) and (3). 
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To demonstrate this, we establish the Butcher array form of the 2N-storage scheme. The original 
Butcher array is expanded in terms of the 2N-storage variables Aj and Bj to obtain 


0 


C2 

<*2,1 

C 3 

<* 3,1 <* 3,2 


6l 6 2 63 


0 



B 1 

By 


B\ + B 2 (A 2 + 1) 

[A 2 i?2 + B\\ 

b 2 


[A 2 (A 3 B 3 -f B 2 ) + B\ 

\A 3 B 3 -f- B 2 B y 


From the relationships defined in equations (1), the values a 2 ,i> <*3,2? & 3 , b 2l an d &i are immediately 
determined by the values of B x , B 2 , B 3 , A 3 , and A 2 , respectively. By definition, the conditions 
c * = Hj=i a i,j will automatically be satisfied. The only condition that is not immediately satisfied 
involves a 3( i , or specifically a 31 = [A 2 B 2 + By adjusting the free parameters in each of the three 
general solutions, this final constraint can be satisfied. For case 1, for which C2 ^ 0,2/3, or c 3 and 
c 3 ^ 0, a one-parameter family of solutions results. If we define 

*1 = V ( 36c 2 + 36c 2 - 1354 + 84 c 2 - 12 ) 

z 2 — 2c\ + c 2 — 2 

2 3 = 12c* - I8C2 + I8C2 - llc 2 + 2 

24 = 36C2 — 36 c 2 -f 13 c 2 — 802 4 - 4 

25 = 69 c 2 - 62 c^ + 28 c 2 - 8 

2 6 = 34(4 - 46c 2 + 34c* - 13c 2 + 2 (7) 

then the one-parameter family is given by 


Bi = c 2 

B = ^ C2 ( C 2 ~ 1) (32 2 — Zj ) - (32 2 - Z\ ) 2 

144c 2 (3c 2 - 2 )(cj - i) 2 
B 3 = — 24(3c 2 — 2)(c 2 — l) 2 

(3*2 - zj) 2 - 12 c 2 (c 2 - 1) (3z 2 - z,) 

^ _ — (6c| — 4 c 2 -f~ l)zi -t~ 3 z 3 

(2c 2 + l)zj — 3 (c 2 + 2)(2c 2 — l) 2 
A 3 = ~ z * Zl + ^-08(2 c 2 — l)c2 — 3(2 c 2 — 1)2 5 

24z lC2 (c 2 - l) 4 + 72 c 2 2 6 + 72c 2 (2c 2 — 13) (8) 

provided that none of the respective denominators vanishes. A scheme that both eliminates the 
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square roots and is close to the optimum truncation error was identified by Williamson [ 1 ] as 


0 


1 

1 

3 

3 

3 

3 15 

4 

16 16 


1 3 8 


6 10 15 


A x 


Bx 


A 2 i?2 

-^3 i?3 



153 

128 


8 _ 

15 


( 9 ) 


This scheme, which is widely used, will be referred to as the (3,3) 2N-storage scheme of Williamson 
in the remainder of this paper. 


For case 2 , for which C 2 = §, C 3 = 0 , and 63 is an arbitrary nonzero number, we find a 2 N-storage 
solution for 63 = — - that results in the scheme 



7_ 3 _1 

12 4 3 


0 


2 

3 


1 _3 

9 4 


9 _ 1 

2 3 


For case 3, for 
solution for 63 = 


which C 2 = C 3 = | and 63 is an arbitrary nonzero number, we find a 2N-storage 
| that results in the scheme 


0 


2 2 

3 3 


2 _ j_ 

3 12 


3 

4 


l JL 1 

4 12 3 



Because the 2N-storage schemes are a subset of the general (3,3) RK scheme solutions, they too have 
identical linear stability envelopes. 


To obtain different linear stability characteristics for third-order RK schemes, additional stages 
must be used. One additional stage adds one constraint equation (c 4 = X^=i a^j) to the seven 
nonlinear equations and adds five new variables. The general solution which involves eight equations 
in fourteen variables, is a six parameter family. If 2 N-storage is required, then three additional 
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equations must be satisfied, which eliminates three of the six free parameters. The general four-stage 
third-order (4,3) RK solution that involves 2N-storage can be expressed as a three-parameter family. 
A two-parameter family may be generated by enforcing the linear, fourth-order constraint. None of 
these general expressions are presented here because of their complexity. 

If third-order accuracy is achieved in four stages instead of three, then the required work per 
time step is increased by one third. By enfprcing the linear, fourth-order constraint, the increased 
stability envelope of the (4,3) RK schemes more than offsets this additional work. The (4,3) schemes 
are more efficient per unit time, which is consistent with the work of Sharp et al. [4]. An extensive 
study of the stability domains of the (4,3) RK schemes shows that a nearly optimal scheme can be 
obtained by setting 


G = 1 + (E *,) 2 + (E 6 ,-c,) Z 2 + ^ £ bidijCjj Z 3 4- 

= l + Z+\z 2 + \z 3 +~Z A 
2 6 24 



Note that these schemes are third-order accurate for the nonlinear problem but fourth-order accurate 
for the linear problem. Three third-order schemes that satisfy the 2N-storage constraint in four stages 
and have a nearly optimal stability envelope are now presented. If we set Ci = C3, then 



If we set C3 = c 4 , then 
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If we demand that the intermediate time levels Cj be monitonically increasing, then 



For problems in which third-order temporal accuracy is sufficient, these new schemes should be 
considered because of their larger stability envelope. 


Section 4: Fourth-Order Runge-Kutta 


Twelve nonlinear algebraic equations in fourteen variables determine the solution to the four-stage 
fourth-order (4,4) RK schemes. Five general solutions exist: one that involves two free parameters 
and four that involve one free parameter, covering the special roots excluded by the first. (See 
Butcher for the general solutions [5].) Fyfe [2] demonstrated that all cases can be implemented in 
the 3N-storage format. Williamson [1] demonstrated that they could not, in general, be implemented 
in the 2N-storage format. The 2N-storage constraint adds three additional equations that cannot be 
satisfied by any of the five general solutions. 


By increasing the number of stages, additional degrees of freedom are generated that can be used 
to satisfy all fourth-order and 2N-storage constraint equations. The five-stage fourth-order (5,4) RK 
scheme may be expressed in Butcher and 2N-storage form as 


Cl 


C 2 

« 2,1 

C 3 

^ 3,1 a 3,2 

c 4 

« 4 t l a 4,2 « 4,3 

C 5 

« 5,1 a 5,2 <* 5,3 . 05 ^ 


6 2 63 ^4 &5 


At Bi 

-A 2 

A3 B3 
A4 j£?4 
A5 


For explicit self-starting schemes, c\ = 0, and A\ = 0. For any (5,4) RK scheme, the nonlinear 
accuracy constraints are 


EL, fc = If" ; EL, b,c, = i <2) ; EL, kc] = | P) ; = 

EL ifce?=i W i EL=, IxCiHjCj = | (4> ; EL=, = U 4) I E?j,*=i b,a u a hk c k = 
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and 


5 

c i — a *»i i — 1 ) 5 
i=i 

Again, the superscripts on equations (11) indicate the order of accuracy from which the constraint 
was obtained. Note that the first four are similar to those needed in the (3,3) and (4,3) RK schemes; 
the last four are specific to the fourth-order accuracy condition. These 13 constraint equations involve 
20 unknowns ( five 6*’s, five c,-’s, and ten a t j’s), which represents at least a seven-parameter family 
of schemes. 

The relationships between the original RK variables 6j, c,- and the 2N-storage variables Aj , Bj 


are 

82,1 


Bu 

ci = 0 

83,1 

— 

A2B2 + B\, 

c 2 = B\ 

83,2 

= 

B*, 

C3 = B\ 4- B 2 (A 2 4* 1) 

84,1 

= 

A 2 {A$Bz 4 - B2) 4- B\ ) 

C 4 = B\ 4- B 2 (A 2 4- 1) 4- B 3 [A 3 (.< 4 2 -f- 1) 4* 1] 

84,2 

= 

A3B3 4 - i ?2 , 

C5 = B\ 4- -S 2 (A 2 4- 1) 4- B3[A 3 (A 2 4- 1) 4* 1] 4- B4{A4[A3(A2 4- 1) 4- 1] 4* 1} , 
61 = A 2 {A 3 [yl 4 (A 5 B 5 4- B4) 4 - B 3 ] 4- B2} 4- B\ 

84,3 

= 

Bs, 

85,1 


A 2 [A 3 (A 4 i? 4 -f- B3) 4- B2 ] 4- B 1 , 

^2 =: A 3 [A 4 (A555 4- B4 ) 4- B3] 4* B2 

85,2 

= 

A3IA4B4 4- B3) 4 - B2, 

63 = A4IA3B5 4- B4) 4- B3 

85,3 

= 

A4B4 4- B3, 

64 = A5B5 4- B4 

85,4 

= 

B4, 

65 = b 5 


If the (5,4) RK scheme is constrained to a 2N-storage format, six additional constraint equations 
are produced. Specifically, from the relationships defined in equations (1), the values 02,1, 03,2, 04,3, 
a 5,4, &4, &3> ^2 j and bi are immediately determined by the values of B\, # 2 , #3, B 4 , B$, A5, 

A 4 , A 3 , and A 2 , respectively. (Ai — 0.) Again, by definition, the conditions c t - = Z)] =1 a t -,j are 
automatically satisfied. The constraint equations that are not immediately satisfied are those that 
involve the values a 3| i, a 4>1 , a 4<2 , 054, a 5)2 , and 05,3. These six additional equations yield a system 
of 19 constraint equations in 20 unknowns. Assuming a solution exists, in general, there is a one 
parameter family of solutions. The general answer is not forthcoming due to the complexity of the 
equations. 

We can determine specific solutions to the 19 equations in 20 unknowns. Frequently, we can 
simplify the equations considerably by assuming special forms of the coefficients c, or A,-. The 
algebra simplifies considerably if we assume that c 2 = c 3 . This assumption eliminates the one degree 
of freedom in the general solution but produces an exact analytic solution. The Butcher diagram for 
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this case becomes 


3 ' 3 ' 6 

1 + $ 1+£11 

3 ' 3 ' 6 

1 y2 2 2 / 3 

3 3 6 

1 


2 | ^2 , 2 2 / 3 

3 + 3 ' 6 

_l_ 2^1 i l 
3 ~ 3 “ 2 

2 2 / 3 , 1 
, 6 + 6 
^2 | 2 2 / 3 | 1 

-3 + 3 + . 2 


I 

, 6 '6 

M 4 . til _L I 
3 + 3 + 2 

_2^, I 

« + 6 


_ 1 _ 2 \/2 _ 2 2 ^ 3 
3 3 3 


1 ^2 2 2 / 3 


I 4- ^ 4 £Zi 1 4 ^ 4 SE 2 2 / 3 ^2 1 1 ^2 2^ r , ig , 2 2 / 3 

3+6 + 12 3+6 + 12 6 3 6 6 “ 6 vT 3 + 6 + XT 


The 2N-storage array becomes 


A 1 

A 2 

A 3 

A 4 

A 5 


Bi 

B 2 

B z 

B 4 

b 5 


-1/3 + ^l-iM 
-y/l-i 1 ! 3 — 2 
-1 + ^2 


2/3 + f +4 
-1 f-+ l/ 6 

-I/3-42-4 

1/3 


1/3 + 


M + 

c I 


6 

2 2/3 

12 


Because X3?,j,*,/=i h^ijdjko-kici — ~(^+ the linear stability envelope is found by considering 


G = 1 + Z + iz 2 + \z 3 + “Z 4 - 

2 6 24 


' 1 ^2 2 2 / 3 

^ — 4 

72 72 72 


where Y2ij t k,l=:i ^i a ij a jk^ki c i ~ 20 " ■ A second analytic solution has been obtained by setting all values 
of the Aj = — 1 for j — 2,5. The 2N-storage array for this case becomes 


Ai 

A 2 

A 3 

a 4 

As 


Bi 

B 2 

Bz 

B 4 

B 5 


0 

-1 

64 B 4 2 — 32 Bi+l 

96 B 4 2 — 60 B 4 
64 B 4 3 — 80 B* 2 +25 Bi 

64 B 4 2 -32 B 4 +l 

-1 

-1 

16 fl 4 2 - 10 B 4 

-1 

b 4 

-1 

4 B 4 — 1 
12 B 4 



for which the amplification matrix is 

G = l + z+iz 2 + iz 3 +lz 4 

(The Butcher array is not presented here because of the small stability envelope of this scheme.) The 
coefficient B 4 is defined as B 4 = — - y/X with X = -163_ , g ecause gilt ~ J_ the 

amplification matrix becomes Gw 1 + Z + |Z 2 -f |Z 3 + ^-Z 4 + Z 5 . Neither of the two analytic 

solutions has a desirable stability envelope, as we shall see later. The first is of marginal value, and 
the second is extremely small. However, they are of some value because they provide good initial 
guesses for determining numerical roots. 

In general, we must resort to a numerical solution of the 19 equations in 20 unknowns. The 
ultimate goal is a scheme that has an optimal stability envelope. We can constrain the last degree of 
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freedom in the general system by demanding that the resulting scheme be optimal, which produces 
a system with 20 equations and 20 unknowns. 


To define the optimal constraint, we begin with the amplification matrix for the ( 5 , 4 ) RK scheme, 
which is given by 


- 1 + Z+ j 7,2 + 6 « a ‘V c j j z3 + | S h i a ij a 3 k c k j z 4 + | biaija jk a k ic , j Z 5 




vi,j,k,l= 1 


A particular scheme can be defined as being optimal when its stability envelope encompasses a 
useful portion of the complex plane that is as large as possible for a particular choice of the complex 
matrix Z. This definition of optimality is problem dependent and depends on the structure of the 
eigenvalues of the system of ODE’s being integrated. 


We now focus specifically on hyperbolic and parabolic partial differential equations (PDE’s), for 
which the ( 5 , 4 ) 2 N-storage RK schemes will be advantageous. A model equation that produces 
an eigenstructure representative of the equations of fluid mechanics is U t + aU x = a U xx . If we 
choose spatial operators for the U x and U xx terms, this equation reduces to a system of ODE’s. 
The complex matrix Z can now be interpreted to represent combinations of the inviscid and viscous 
stability characteristics. Figure 1 shows a parametric study of the stability envelope as a function 
of the CFL number ^A = and the viscous CFL number (A = ? as determined from the 

amplification matrix G. Regions toward the origin represent stable regions for the scheme, and the 
upper right portion of the plot represents the unstable regions. The spatial operator used for both the 
inviscid and viscous operators was the sixth-order compact scheme [7]; however, similar conclusions 
could have been reached with a second-order or a Fourier method. (See the appendix for more 
details.) The parameter a is defined as a = T,ij, k ,i = i By adjusting -1 < a < 1, large 

stability envelopes are obtained for values of a « 0 . Figure 2 shows a more refined plot of values 
28o — a - no- Note that all curves have a similar structure but that no one curve simultaneously 
maximizes the inviscid (imaginary axis) and viscous (real axis) stability limits. Further inspection 
reveals that the maximum inviscid stability occurs at a = ^55^; the viscous stability limit occurs 
at a « 244 ■ The value a — ^ provides a compromise between the two extremes. Caution should 
be exercised in using values of a near because the inviscid stability limit in the case of very 
small viscous CFL values may change dramatically with small changes in a. Figure 3 shows the 
stability limits for the ( 3 , 3 ) RK scheme, the ( 4 , 4 ) RK scheme, and three ( 5 , 4 ) RK schemes that are 
functions of the parameter a. Figure 3 illustrates why the analytic schemes presented earlier do not 
have desirable stability characteristics. The stability envelope for the analytic case a « =£ is about 
half the size of the optimal a = the analytic case a « T h as a vanishingly small inviscid 
stability limit. 


The optimal value a — 255 is used to provide the 20 th nonlinear equation in 20 unknowns. The 
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general solution is no longer a one-parameter family and, because of the nonlinearity in the equations, 
is not unique. Specifically, the 20th equation becomes 


5 

^ o t ,i ® j, fc ® fc,/ Q 


1 

200 


(13) 


To solve the 20 equations in 20 unknowns, the partial linearity of the system is exploited to reduce 
it to nine equations in the variables Aj,Bj,j = 1,5 with A\ = 0. This procedure is equivalent to 
substituting the values of a,*j, 6j, and c t - found in equations (12) into equations (11) and (13). The 
Jacobian of the system is calculated analytically, and then a factored secant update is used to obtain 
a solution. Because the equations are nonlinear, multiple roots exist. Some roots were found with 
the analytic solutions as an initial guess. Others were found with random initial guesses. 


At least nine real roots have been identified that satisfy all nine equations. In general, none of 
these nine formulations is optimal for every problem. We can, however, use heuristic arguments to 
identify certain roots as less desirable. Verner [3] cites several theoretical considerations that should 
be used in determining desirable roots. Those that are relevant to this work are 


I Each intermediate time level (c t -, i = 1,5) should be in the interval [0,1] to control the effect of 
rapidly changing derivatives. 

II The weights of the bj,j — 1,5 should be positive. 

Ill Coefficients should incorporate rational numbers requiring a small number of digits. 


Four of the nine roots satisfy condition I; only one satisfies conditions I and II. We were not able to 
express any of the roots in terms of “convenient” rational numbers. 

Table 1 shows the four roots which satisfy the condition 0 < c t < 1, for i - 1,5. Note that they 
all have monotonicaliy increasing values of c t . Solution 3 satisfies the constraint that all bj > 0. 
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Table 1. Four Sets of Coefficients For optimal (5,4) 2N-Storage RK Schemes 


COEF 

SOLUTION 1 

SOLUTION 2 

SOLUTION 3 

SOLUTION 4 

Ai 

0 

0 

0 

0 

A2 

-0.4812317431372 

-0.4801594388478 

-0.4178904745 

-0.7274361725534 

A3 

-1.049562606709 

-1.4042471952 

-1.192151694643 

-1.906288083353 

A4 

-1.602529574275 

-2.016477077503 

-1.697784692471 

-1.444507585809 

A5 

-1.778267193916 

-1.056444269767 

-1.514183444257 

-1.365489400418 

Bi 

9.7618354692056# -2 

0.1028639988105 

0.1496590219993 

4.1717869324523#- 2 

b 2 

0.4122532929155 

0.7408540575767 

0.3792103129999 

1.232835518522 

B3 

0.4402169639311 

0.7426530946684 

0.8229550293869 

0.5242444514624 

b 4 

1.426311463224 

0.4694937902358 

0.6994504559488 

0.7212913223969 

B 5 

0.1978760537318 

0.1881733382888 

0.1530572479681 

0.2570977031703 

Cl 

0 

0 

0 

0 

C 2 

9.7618354692056# — 2 

0.1028639988105 

0.1496590219993 

4.1717869324523#- 2 

C3 

0.3114822768438 

0.487989987833 

0.3704009573644 

0.377744236865 

C 4 

0.5120100121666 

0.6885177231562 

0.6222557631345 

0.6295990426348 

£5 

0.8971360011895 

0.9023816453077 

0.9582821306748 

0.8503409780005 


Other values of the parameter a produce similar results, although we have not been able to find 


roots that satisfy condition III. 


The floating point numbers can be expressed as the fractions of integer numbers. For solution 3, 
the rational form of the coefficients with 26-digit precision becomes 


il 

0 

A 2 — 

567301805773 . 
1357537059087’ 

CO 

II 

2404267990393. 

2016746695238’ 

yt 4 = 

3550918686646. 

2091501179385’ 

11 

iO 

1275806237668. 
842570457699 ’ 

D _ 1432997174477. 
XJ1 ~ 9575080441755' 

b 2 = 

5161836677717 . 
13612068292357’ 

#3 = 

1720146321549. 

2090206949498’ 

# 4 = 

3134564353537. 

4481467310338’ 

b 5 = 

2277821191437 . 
14882151754819’ 

Ci = 0; 

c 2 - 

1432997174477. 

9575080441755’ 

c 3 = 

2526269341429. 

6820363962896’ 

c 4 = 

2006345519317. 

3224310063776’ 

C 5 = 

2802321613138 

2924317926251 


and can be used on machines that have up to 26 significant digits without appreciable roundoff 
errors in the coefficients. 


Section 5: Accuracy and Efficiency of (5,4) RK Schemes 


The four new (5,4) 2N-storage schemes given in Table 1 are compared with the classical (4,4) 3N- 
storage RK scheme, the (3,3) 2N-storage RK scheme advocated by Williamson [1] (presented in 
equation (9)), and the new (4,3) 2N-storage RK scheme presented in equation (10). We begin with 
a nonlinear ODE used by Dormand et al. [8] to test the accuracy of various RK schemes. The ODE 
is defined by y' = y cos(x ), y(0) = 1 on the interval 0 < x < 20, with the exact solution 

y(x) = exp 5,n ( x ). Figure 4 shows a convergence study for all seven schemes. The two third-order 
schemes ((3,3), and (4,3)) approach the exact solution with a slope of -3; the classical fourth-order 
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(4,4) scheme and the four (5,4) schemes approach the exact solution with a slope of -4. Note that the 

(3.3) scheme is more accurate than the (4,3) scheme (Williamson claims the (3,3) scheme is optimal 
in terms of error) and that all (5,4) schemes are more accurate than the (4,4) scheme. Relatively 
little difference exists between the four (5,4) solutions, and none of the (5,4) schemes is uniformly 
optimal over the entire range of the refinement study in terms of error. 

A second nonlinear test problem is used to establish problem-dependent trends. The ODE is 
defined by y' - y n sin n ~ 1 (x)cos(x), y( 0) = 1 on the interval 0 < x < 20, with the exact solution 
y(x) = exp s,Tl where the exponent n is n = 4. Figure 5 shows a convergence study for all seven 
of the schemes. The convergence rates for the 2N-storage schemes are consistent with the theoretical 
predictions, but the conventional (4,4) scheme converges at a rate that approaches -5. This rate 
cannot generally be expected. If the four (5,4) 2N-storage schemes are compared between the two 
nonlinear problems, solution 3 and solution 1 are most accurate in the first and second problems, 
respectively. No clear advantage is evident for using any particular one of the four (5,4) 2N-storage 
schemes. 

This study verifies the nonlinear accuracy of the newly developed 2N-storage schemes for ODE’s. 
Note that in both test problems, Williamson’s (3,3) scheme is more accurate and requires one-third 
fewer function evaluations than the (4,3) scheme, although no attempt was made to optimize the 

(4.3) schemes in terms of error. The (5,4) schemes are more accurate than the conventional (4,4) RK 
scheme in the first problem and more accurate over a significant portion of the second problem. The 

(5.4) schemes appear to be more adversely affected by roundoff errors than the conventional (4,4) 
scheme. 

Our second problem is the solution of the linear hyperbolic equation defined by 


du 

du 


dt 

-f — = 0, 0 < x < M > 0 

(14) 


u(0, t) = sin27r(— £), £>0 

(15) 


u(x,0) = sin27r(a;), 0 < x < 1 

(16) 


The exact solution is 


u(x,t) — sin27r (x — t), 0<x<l,t>0 

and is a model for the class of problems that the (5,4) 2N-storage schemes were developed to integrate. 

Equations (14) through (16) are solved with the same four RK schemes used in the nonlinear test 
problems; specifically, the (3,3) and (4,3) third-order RK schemes and the (4,4) and (5,4) RK schemes. 
In all cases, the spatial operator used is the sixth-order compact scheme developed by Carpenter et 
al. [7] and shown to be formally sixth-order accurate. The physical boundary condition is imposed 
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by solving the differentiated boundary condition on the boundary with the RK procedure. This 
technique was shown by Carpenter et al. [9] to yield a fourth-order temporally accurate procedure. 
Specifically, the boundary condition is d 3 u(0,t)/dt 3 = g"'(t), where g is the physical boundary 
condition at the inflow plane. The CFL that governs the stability of the hyperbolic problem is a 
function of the temporal advancement and the spatial discretization used. Table 1 in the appendix 
compares the inviscid CFL’s and the viscous stability limits of the (3,3), (4,4), and (5,4) RK schemes 
with various spatial operators, including the sixth-order compact spatial operator. 

After grid refinement with a vanishingly small CFL, all schemes recover the theoretical spatial 
sixth-order accuracy. On a specific grid, temporal refinement showed fourth-order temporal accuracy. 
Table 2 shows the results from a grid- refinement study performed with a CFL of f (about — of the 
theoretical maximum CFL) and the (5,4) RI< scheme (solution 3). 

Table 2. Grid refinement at CFL = § for (5,4) 2N-storage RI< schemes 


Grid 

log L 2 

Rate 

21 

-2.4026 


41 

-3.3835 

3.26 

81 

-4.5791 

3.97 

161 

-5.7823 

4.00 

321 

-6.9860 

4.00 

641 

-8.1962 

4.02 


The new scheme is clearly fourth-order accurate for the linear hyperbolic problem. The log L 2 
error of all four (5,4) schemes differed in the third significant digit on all grids. In addition, all four 
schemes were stable up to but not greater than the theoretical stability limit of CFL max - 1.67. 
(See the appendix.) 

The relative efficiency of each scheme is not addressed in any of the previously discussed test 
problems. The advantage in terms of accuracy for the (5,4) schemes is partially lost due to increased 
function evaluations. To quantify this trade-off, a comparison of accuracy and cost is made between 
the (3,3), (4,3), (4,4), and (5,4) RK schemes. The test problem is again the hyperbolic wave equation; 
however, this time the schemes are compared on an equal-cost basis. Here, cost is defined as the 
number of times the spatial terms are evaluated per unit time interval. The (3,3) RK scheme was 
run at a CFL of both (4,4) RK schemes were run at a CFL of jr and the (5,4) RK schemes were 
run at a CFL of £, where 4 < k < 16. The resulting schemes are all numerically stable, and 
temporal eirors vaiy by a factor of 256 for the fourth-order case. The effective cost per unit time 
was, therefoie, identical. All schemes were run to a physical time of t = 50. A grid refinement study 
was also performed to determine the order of accuracy. For all cases, the time asymptotically stable, 
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sixth-order compact spatial operator [7] was used with the boundary conditions that were previously 
described. 


Tables 3(a) and 3(b) show a grid refinement study at two different CFL numbers, the first of 
which is near the CFL max and the second of which is at half the value of the first. Only the (4,4) RK 
results are presented here because the (3,4) and the (4,4) RK schemes produce the same results to 
five significant digits. The results for the (5,4) scheme come from solution 3. (This study is a linear 
advection problem for which the four-stage third-order scheme performs to fourth-order accuracy.) 

Table 3(a). Grid-Refinement Study for Three RK Schemes on Linear Advection Equation U t -f 
U x = 0, Run at Equal Cost Near Maximum CFL 



(3,3) (2N) 


(4,4) (3N) 


(5,4) (2N) 


Grid 

log T 2 

Rate 

log L 2 

Rate 

log L 2 

Rate 

41 

-2.8082 


-3.6932 


-3.7086 


81 

-3.7341 

3.08 

-4.8877 

3.97 

-4.8996 

3.95 

161 

-4.6494 

2.43 

-6.0928 

4.00 

-6.1039 

4.00 

321 

-5.5567 

3.01 

-7.2969 

4.00 

-7.3078 

4.00 

641 

-6.4625 

3.01 

-8.5054 

4.01 

-8.5320 

4.07 


Table 3(b). Grid-Refinement Study for Three RK Schemes on Linear Advection Equation U t + 
U x — 0, Run at Equal Cost About Half Maximum CFL 



(3,3) (2N) 


(4,4) (3N) 


(5,4) (2N) 


Grid 

log L 2 

Rate 

log L 2 

Rate 

log L 2 

Rate 

41 

-3.6257 


-4.2404 


-4.2474 


81 

-4.6231 

3.31 

-5.7992 

5.18 

-5.8032 

5.17 

161 

-5.5515 

3.08 

-7.2519 

4.83 

-7.2661 

4.85 

321 

-6.4611 

3.02 

-8.5021 

4.15 

-8.5284 

4.21 

641 

-7.3659 

3.01 

-10.6444 

7.11 

-9.9965 

4.88 


The (5,4) and the (4,4) schemes are nearly identical when compared at equal cost. The (5,4) 
schemes are more susceptible to roundoff errors, as can be identified when the absolute error is near 
machine precision. 

These test problems demonstrate the efficacy of the (5,4) 2N-storage schemes for nonlinear ODE’s 
and the linear PDE s. These results are expected to extend to nonlinear PDE’s. If the overriding 
concern in the choice of integrator is the reduction of storage, then the newly developed (5,4) schemes 
clearly outperform Williamson’s (3,3) 2N-storage scheme in terms of accuracy and efficiency. The 





accuracy per unit cost is comparable with that of the standard four-stage fourth-order (4,4) RK 
scheme. 


Conclusions 


Several new five-stage fourth-order (5,4) Runge-Kutta (RK) schemes are derived that require only two 
storage locations. Two of the schemes have analytic coefficients that facilitate simple implementation, 
but neither have desirable stability characteristics. A particular scheme is identified (by numerically 
solving the nonlinear equations) which has the desirable efficiency characteristics for hyperbolic and 
parabolic initial (boundary) value problems. In the inviscid and viscous limits, this new (5,4) 2N- 
storage RK scheme has greater accuracy for a given step size and has a larger allowable stability 
domain than the (3,3) 2N-storage RK scheme advocated by Williamson. The new (5,4) RK scheme 
is comparable with the standard (4,4) RK scheme in terms of accuracy to work ratio and is nearly 
as efficient in an absolute sense. Numerical tests are presented that verify these results on nonlinear 
ordinary differential equations (ODE’s) and linear hyperbolic equations. 
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Appendix I 

Table Al shows the inviscid and viscous stability limits obtained from the equation u t + au x — 
au xx , discretized with various spatial and temporal techniques. The inviscid limit is the special case 
in which a = 0; the viscous limit is the special case in which a — 0. The first column describes the 
order of the spatial operator. The letter in parentheses describes the bandwidth of the left-hand-side 
matrix necessary for compact schemes. Specifically, ( E ), (T), and ( P ) signify one-diagonal (explicit), 
tridiagonal, and pentadiagonal matrices, respectively. The sixth-order (T) is the spatial scheme that 
is used in all the hyperbolic studies presented in this work. Columns 2-4 report the inviscid stability 
limits of the (3,3), (4,4), and (5,4) RK schemes, respectively. Columns 5-8 report the viscous stability 
limits of the (3,3), (4,4), and (5,4) RK schemes, respectively. 

For the inviscid case with a sixth-order compact spatial operator (6T), the (3,3) scheme has an 
effective CFL/stage of ^ = 0.289; the (4,4) and (5,4) schemes are ^ = 0.354 and = 0.334, 
respectively. In an absolute sense, the (4,4) RK scheme is the most efficient. Relative to each other, 
the efficiencies of the (3,3), (4,4), and (5,4) RK schemes are 0.816 : 1.0 : 0.945, respectively, if we 
assume that the (4,4) scheme has been assigned an efficiency of 1. (Higher numbers are better.) 

The viscous CFL/stage for each of the three methods with the (6T) spatial differencing is 2^2° = 
0.210, 2^92 = 0.175, and = 0.234, respectively. Here, the (5,4) scheme is most efficient, followed 
by the (3,3) scheme. The relative efficiencies, if we assume that the (4,4) scheme has an absolute 
efficiency of 1.00, becomes 1.20 : 1.00 : 1.34, respectively, for the (3,3), (4,4) and (5,4) schemes. 

For both the inviscid and viscous limits, the (5,4) 2N-storage RI( scheme outperforms the (3,3) 
2N-storage scheme proposed by Williamson. In the viscous case, the (5,4) scheme dramatically out- 
performs the conventional (4,4) RK scheme. In the inviscid case, the (5,4) RK scheme is competitive 
but not quite as efficient as the (4,4) RI< scheme. 
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Table Al: Inviscid and Viscous CFL limits 



Invis 

Stab 

Limit 

Vis 

Stab 

Limit 

Number of stages 

3 

4 

5 

3 

4 

5 

Order of accuracy 

3 

4 

4 

3 

4 

4 

2nd (E) 

V3 

2%/2 

3.34 

2.51 

2.78 

4.65 

4th (E) 

1.26 

2.06 

2.43 

1.33 

1.47 

2.47 

4th (T) 

1 

2y/2 

y/3 

1.92 

0.83 

0.92 

1.55 

6th ( E ) 

1.09 

1.78 

2.10 

0.99 

1.10 

1.85 

6th (T) 

2 

V2 

1.67 

0.63 

0.70 

1.17 

6th (P) 

0.88 

1.44 

1.71 

0.65 

0.73 

1.22 

8th (E) 

1.00 

1.63 

1.93 

0.83 

0.92 

1.55 

8th (T) 

0.81 

1.32 

1.56 

0.55 

0.61 

1.02 

8th ( P ) 

0.78 

1.28 

1.51 

0.51 

0.57 

0.95 

10th ( E ) 

0.94 

1.53 

1.81 

0.74 

0.82 

1.37 

10th (T) 

0.77 

1.26 

1.49 

0.50 

0.56 

0.93 

10th (P) 

0.74 

1.21 

1.43 

0.46 

0.51 

0.86 

Fourier 

& 

Z 

2 72 

t r 


0.25 

0.28 

0.47 
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FIGURE 1. Gross parametric study (in variable a) of stability envelope for RKf5 41 schemes 
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FIGURE 3. Comparison of optimized RK[5,4] schemes with conventional RK[3,3] and RK[4,4] 
schemes. 
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